Inventi Impact: Bioinformatics

Current Issue: April - June
Volume: 2020
Issue Number: 2
Articles: 6

Articles

Inventi:ebi/30909/20

A-Lister: A Tool for Analysis of Differentially Expressed Omics Entities Across Multiple Pairwise Comparisons

Stanislav A Listopad, Trina M Norden-Krichmar

Research article

Background: Researchers commonly analyze lists of differentially expressed entities (DEEs), such as differentially expressed genes (DEGs), differentially expressed proteins (DEPs), and differentially methylated positions/regions (DMPs/DMRs), across multiple pairwise comparisons. Large biological studies can involve multiple conditions, tissues, and timepoints that result in dozens of pairwise comparisons. Manually filtering and comparing lists of DEEs across multiple pairwise comparisons, typically done by writing custom code, is a cumbersome task that can be streamlined and standardized.

Results: A-Lister is a lightweight command line and graphical user interface tool written in Python. It can be executed in a differential expression mode or generic name list mode. In differential expression mode, A-Lister accepts as input delimited text files that are output by differential expression tools such as DESeq2, edgeR, Cuffdiff, and limma. To allow for the most flexibility in input ID types, to avoid database installation requirements, and to allow for secure offline use, A-Lister does not validate or impose restrictions on entity ID names. Users can specify thresholds to filter the input file(s) by column(s) such as p-value, q-value, and fold change. Additionally, users can filter the pairwise comparisons within the input files by fold change direction (sign). Queries composed of intersection, fuzzy intersection, difference, and union set operations can also be performed on any number of pairwise comparisons. Thus, the user can filter and compare any number of pairwise comparisons within a single A-Lister differential expression command.

In generic name list mode, A-Lister accepts delimited text files containing lists of names as input. Queries composed of intersection, fuzzy intersection, difference, and union set operations can then be performed across these lists of names.

Conclusions: A-Lister is a flexible tool that enables the user to rapidly narrow down large lists of DEEs to a small number of most significant entities. These entities can then be further analyzed using visualization, pathway analysis, and other bioinformatics tools....
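The filtering and set operations described in this abstract can be illustrated with a minimal Python sketch. This is not A-Lister's code or command-line interface: the function names, the tuple layout, the thresholds, and the toy entity IDs are all hypothetical, and "fuzzy intersection" is assumed here to mean "present in at least a chosen number of the lists".

```python
from collections import Counter


def filter_entities(rows, max_q=0.05, min_abs_log2fc=1.0, direction=None):
    """Return the IDs from one pairwise comparison that pass the thresholds.

    rows: iterable of (entity_id, q_value, log2_fold_change) tuples.
    direction: None (either sign), "up" (positive) or "down" (negative).
    """
    kept = set()
    for entity_id, q_value, log2fc in rows:
        if q_value > max_q or abs(log2fc) < min_abs_log2fc:
            continue
        if direction == "up" and log2fc <= 0:
            continue
        if direction == "down" and log2fc >= 0:
            continue
        kept.add(entity_id)
    return kept


def fuzzy_intersection(sets, min_count):
    """IDs found in at least min_count of the sets (assumed definition)."""
    counts = Counter(entity for s in sets for entity in s)
    return {entity for entity, count in counts.items() if count >= min_count}


# Hypothetical toy comparisons: (entity ID, q-value, log2 fold change).
comparison_a = [("geneA", 0.001, 2.5), ("geneB", 0.20, 3.0), ("geneC", 0.01, -1.8)]
comparison_b = [("geneA", 0.03, 1.2), ("geneC", 0.04, -2.2), ("geneD", 0.002, 0.4)]
comparison_c = [("geneA", 0.5, 2.0), ("geneC", 0.01, -1.1), ("geneE", 0.01, 1.9)]

a, b, c = (filter_entities(rows) for rows in (comparison_a, comparison_b, comparison_c))

shared_all = a & b & c                             # intersection
shared_most = fuzzy_intersection([a, b, c], 2)     # fuzzy intersection
in_any = a | b | c                                 # union
only_c = c - a - b                                 # difference
up_in_a = filter_entities(comparison_a, direction="up")
```

Keeping each filtered comparison as a plain set means any query is just a combination of the built-in set operators, which is one simple way to mirror the query types listed above.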

Inventi:ebi/30911/20

An Integrative Methodology Based on Protein-Protein Interaction Networks for Identification and Functional Annotation of Disease-Relevant Genes Applied to Channelopathies

Milagros Marín, Francisco J Esteban, Hilario Ramírez-Rodrigo, Eduardo Ros, María José Sáez-Lara

Research article

Background: Biologically data-driven networks have become powerful analytical tools that handle massive, heterogeneous datasets generated from biomedical fields. Protein-protein interaction networks can identify the most relevant structures directly tied to biological functions. Functional enrichments can then be performed based on these structural aspects of gene relationships for the study of channelopathies. Channelopathies refer to a complex group of disorders resulting from dysfunctional ion channels with distinct polygenic manifestations. This study presents a semi-automatic workflow using protein-protein interaction networks that can identify the most relevant genes and their biological processes and pathways in channelopathies to better understand their etiopathogenesis. In addition, the clinical manifestations that are strongly associated with these genes are also identified as the most characteristic in this complex group of diseases.

Results: In particular, a set of nine representative disease-related genes was detected, these being the most significant genes in relation to their roles in channelopathies. In this way we attested the implication of some voltage-gated sodium (SCN1A, SCN2A, SCN4A, SCN4B, SCN5A, SCN9A) and potassium (KCNQ2, KCNH2) channels in cardiovascular diseases, epilepsies, febrile seizures, headache disorders, neuromuscular, neurodegenerative diseases or neurobehavioral manifestations. We also revealed the role of Ankyrin-G (ANK3) in the neurodegenerative and neurobehavioral disorders as well as the implication of these genes in other systems, such as the immunological or endocrine systems.

Conclusions: This research provides a systems biology approach to extract information from interaction networks of gene expression. We show how large-scale computational integration of heterogeneous datasets, PPI network analyses, functional databases and published literature may support the detection and assessment of possible potential therapeutic targets in the disease. Applying our workflow makes it feasible to spot the most relevant genes and unknown relationships in channelopathies and shows its potential as a first-step approach to identify both genes and functional interactions in clinical-knowledge scenarios of target diseases.

Methods: An initial gene pool is previously defined by searching general databases under a specific semantic framework. From the resulting interaction network, a subset of genes are identified as the most relevant through the workflow that includes centrality measures and other filtering and enrichment databases....
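The abstract mentions centrality measures without naming them. As a rough illustration only, the sketch below ranks nodes of a small interaction network by degree centrality, one common choice; whether the authors used this particular measure is an assumption. The node labels `G1` to `G5` and the edge list are hypothetical placeholders, not real interactions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Degree centrality: number of distinct neighbors divided by (n - 1)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions in this sketch
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def top_nodes(centrality, k):
    """The k highest-scoring nodes; ties are broken alphabetically."""
    return sorted(centrality, key=lambda node: (-centrality[node], node))[:k]


# Hypothetical toy network (undirected edges between placeholder genes).
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]

scores = degree_centrality(toy_edges)
hubs = top_nodes(scores, 2)
```

In a real workflow the edge list would come from an interaction database and the top-ranked nodes would then be passed on to enrichment analysis.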

Inventi:ebi/30907/20

Estimating Network Changes from Lifespan Measurements Using a Parsimonious Gene Network Model of Cellular Aging

Hong Qin

Research article

Background: Cellular aging is best studied in the budding yeast Saccharomyces cerevisiae. As an example of a pleiotropic trait, yeast lifespan is influenced by hundreds of interconnected genes. However, no quantitative methods are currently available to infer system-level changes in gene networks during cellular aging.

Results: We propose a parsimonious mathematical model of cellular aging based on stochastic gene interaction networks. This network model is made of only non-aging components: the strength of gene interactions declines with a constant mortality rate. Death of a cell occurs in the model when an essential node loses all of its interactions with other nodes, and is equivalent to the deletion of an essential gene. Stochasticity of gene interactions is modeled using a binomial distribution. We show that the exponential increase of mortality rate over time can emerge from this gene network model during the early stages of aging.

We developed a maximal likelihood approach to estimate three lifespan-influencing network parameters from experimental lifespans: t0, the initial virtual age of the network system; n, the average lifespan-influencing interactions per essential node; and R, the initial mortality rate. We applied this model to yeast mutants with known effects on replicative lifespans. We found that deletion of SIR2, FOB1, and HXK2 considerably altered the initial virtual age but not the average lifespan-influencing interactions per essential node, suggesting that these mutations mainly influence the reliability of gene interactions but not the overall configurations of gene networks.

We applied this model to investigate replicative lifespans of yeast natural isolates. We estimated that the average number of lifespan-influencing interactions per essential node is 7.0 (6.1-8) and the average estimated initial virtual age is 45.4 (30.6-74) cell divisions in these isolates. We also found that t0 could potentially mediate the observed Strehler-Mildvan correlation in yeast natural isolates.

Conclusions: Our theoretical model provides a parsimonious interpretation of experimental lifespan data from the perspective of gene networks. We hope that our work will stimulate more interest in developing network models to study aging as a pleiotropic trait....
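The idea that a network of non-aging components can still show a mortality rate that rises with age can be sketched with a deliberately simplified toy model. The sketch below is not the model from the paper: it assumes every essential node starts with exactly `n` interactions, every interaction fails independently at a constant rate, and there are `m` essential nodes, where `m` and `rate` are hypothetical parameters. It omits the binomial variation, t0, and R described above, so it should only be read as showing a rising hazard, not the exponential form the authors derive.

```python
import math


def survival(t, n, m, rate):
    """Probability that a cell is still alive at age t in the toy model."""
    p_lost = 1.0 - math.exp(-rate * t)   # one interaction has failed by age t
    p_node_dead = p_lost ** n            # all n interactions of one node failed
    return (1.0 - p_node_dead) ** m      # no essential node has lost everything


def mortality_rate(t, n, m, rate, dt=0.01):
    """Hazard at age t, estimated as -d ln S / dt by a central difference."""
    s_before = survival(t - dt / 2, n, m, rate)
    s_after = survival(t + dt / 2, n, m, rate)
    return -(math.log(s_after) - math.log(s_before)) / dt


# Hypothetical parameter values chosen only for illustration.
ages = [5, 10, 20, 40]
rates = [mortality_rate(t, n=5, m=100, rate=0.01) for t in ages]
```

Although each interaction fails at a constant rate, the redundancy within each node means the cell-level mortality rate in `rates` increases with age.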

Inventi:ebi/30912/20

MLW-gcForest: A Multi-Weighted gcForest Model Towards the Staging of Lung Adenocarcinoma Based on Multi-Modal Genetic Data

Yunyun Dong, Wenkai Yang, Jiawen Wang, Juanjuan Zhao, Yan Qiang, Zijuan Zhao, Ntikurako Guy Fernand Kazihise, Yanfen Cui, Xiaotong Yang, Siyuan Liu

Research article

Background: Lung cancer is one of the most common types of cancer, among which lung adenocarcinoma accounts for the largest proportion. Currently, accurate staging is a prerequisite for effective diagnosis and treatment of lung adenocarcinoma. Previous research has used mainly single-modal data, such as gene expression data, for classification and prediction. Integrating multi-modal genetic data (gene expression RNA-seq, methylation data and copy number variation) from the same patient provides the possibility of using multi-modal genetic data for cancer prediction. A new machine learning method called gcForest has recently been proposed. This method has been proven to be suitable for classification in some fields. However, the model may face challenges when applied to small samples and high-dimensional genetic data.

Results: In this paper, we propose a multi-weighted gcForest algorithm (MLW-gcForest) to construct a lung adenocarcinoma staging model using multi-modal genetic data. The new algorithm is based on the standard gcForest algorithm. First, different weights are assigned to different random forests according to the classification performance of these forests in the standard gcForest model. Second, because the feature vectors generated under different scanning granularities have a diverse influence on the final classification result, the feature vectors are given weights according to the proposed sorting optimization algorithm. Then, we train three MLW-gcForest models based on three single-modal datasets (gene expression RNA-seq, methylation data, and copy number variation) and then perform decision fusion to stage lung adenocarcinoma. Experimental results suggest that the MLW-gcForest model is superior to the standard gcForest model in constructing a staging model of lung adenocarcinoma and is better than the traditional classification methods. The accuracy, precision, recall, and AUC reached 0.908, 0.896, 0.882, and 0.96, respectively.

Conclusions: The MLW-gcForest model has great potential in lung adenocarcinoma staging, which is helpful for the diagnosis and personalized treatment of lung adenocarcinoma. The results suggest that the MLW-gcForest algorithm is effective on multi-modal genetic data, which consist of small samples and are high dimensional....
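Performance-based weighting and decision fusion can be sketched in a few lines of Python. The abstract does not give the weighting formula, so this sketch simply assumes weights proportional to a validation accuracy and fuses class-probability vectors by a weighted average; the accuracies and probabilities below are hypothetical numbers, not results from the paper.

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def weighted_fusion(prob_vectors, weights):
    """Weighted average of per-model class-probability vectors."""
    norm = normalize(weights)
    fused = [0.0] * len(prob_vectors[0])
    for probs, weight in zip(prob_vectors, norm):
        for class_index, p in enumerate(probs):
            fused[class_index] += weight * p
    return fused


def predict(fused):
    """Index of the class with the highest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical validation accuracies of three single-modal models
# and their predicted class probabilities for one sample.
accuracies = [0.9, 0.6, 0.5]
probabilities = [[0.2, 0.8], [0.7, 0.3], [0.7, 0.3]]

fused = weighted_fusion(probabilities, accuracies)
unweighted = weighted_fusion(probabilities, [1, 1, 1])
```

With these toy numbers the unweighted average favors class 0, while the accuracy-weighted fusion follows the strongest model and favors class 1, which is the effect the weighting is meant to achieve.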

Inventi:ebi/30908/20

SCOPIT: Sample Size Calculations for Single-cell Sequencing Experiments

Alexander Davis, Ruli Gao, Nicholas E Navin

Research article

Background: In single cell DNA and RNA sequencing experiments, the number of cells to sequence must be decided before running an experiment, and afterwards, it is necessary to decide whether sufficient cells were sampled. These questions can be addressed by calculating the probability of sampling at least a defined number of cells from each subpopulation (cell type or cancer clone).

Results: We developed an interactive web application called SCOPIT (Single-Cell One-sided Probability Interactive Tool), which calculates the required probabilities using a multinomial distribution (www.navinlab.com/SCOPIT). In addition, we created an R package called pmultinom for scripting these calculations.

Conclusions: Our tool for fast multinomial calculations provides a simple and intuitive procedure for prospectively planning single-cell experiments or retrospectively evaluating if sufficient numbers of cells have been sequenced. The web application can be accessed at navinlab.com/SCOPIT....
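The probability described above, that a multinomial sample contains at least a given number of cells from every subpopulation, can be computed exactly with a small dynamic program. The sketch below is an independent Python illustration, not the SCOPIT or pmultinom implementation; the function names are hypothetical, the subpopulation frequencies are assumed to sum to 1, and `math.comb` requires Python 3.8 or later.

```python
import math


def prob_at_least(n_cells, freqs, min_cells):
    """P(every subpopulation contributes >= min_cells cells out of n_cells).

    Subpopulations are processed one at a time: given the cells not yet
    assigned, the count for the next subpopulation is binomial with its
    frequency rescaled by the probability mass that remains.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    state = [0.0] * (n_cells + 1)   # state[r]: probability that r cells remain
    state[n_cells] = 1.0
    remaining_mass = 1.0
    for index, freq in enumerate(freqs):
        last = index == len(freqs) - 1
        q = 1.0 if last else min(1.0, freq / remaining_mass)
        new_state = [0.0] * (n_cells + 1)
        for r, p_r in enumerate(state):
            if p_r == 0.0:
                continue
            for k in range(min_cells, r + 1):
                new_state[r - k] += p_r * math.comb(r, k) * q**k * (1.0 - q) ** (r - k)
        state = new_state
        remaining_mass -= freq
    return state[0]


def min_cells_needed(freqs, min_cells, target, max_n=10_000):
    """Smallest sample size whose success probability reaches the target."""
    for n in range(min_cells * len(freqs), max_n + 1):
        if prob_at_least(n, freqs, min_cells) >= target:
            return n
    return None


# Two equally frequent subpopulations, at least one cell from each.
p_two_cells = prob_at_least(2, [0.5, 0.5], 1)        # 0.5
n_required = min_cells_needed([0.5, 0.5], 1, 0.95)   # 6
```

The same two functions cover both uses named in the abstract: `min_cells_needed` for prospective planning and `prob_at_least` for checking a completed experiment.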

Inventi:ebi/30910/20

The Assessment of Efficient Representation of Drug Features Using Deep Learning for Drug Repositioning

Mahroo Moridi, Marzieh Ghadirinia, Ali Sharifi-Zarchi, Fatemeh Zare-Mirakabad

Research article

Background: De novo drug discovery is a time-consuming and expensive process. Nowadays, drug repositioning is utilized as a common strategy to discover a new drug indication for existing drugs. This strategy is mostly used in cases with a limited number of candidate pairs of drugs and diseases. In other words, it is not scalable to a large number of drugs and diseases. Most of the in-silico methods mainly focus on linear approaches while non-linear models are still scarce for new indication predictions. Therefore, applying non-linear computational approaches can offer an opportunity to predict possible drug repositioning candidates.

Results: In this study, we present a non-linear method for drug repositioning. We extract four drug features and two disease features to find the semantic relations between drugs and diseases. We utilize deep learning to extract an efficient representation for each feature. These representations reduce the dimension and heterogeneity of biological data. Then, we assess the performance of different combinations of drug features to introduce a pipeline for drug repositioning. In the available database, there are different numbers of known drug-disease associations corresponding to each combination of drug features. Our assessment shows that as the numbers of drug features increase, the numbers of available drugs decrease. Thus, the proposed method with large numbers of drug features is as accurate as with small numbers.

Conclusion: Our pipeline predicts new indications for existing drugs systematically, in a more cost-effective way and shorter timeline. We assess the pipeline to discover the potential drug-disease associations based on cross-validation experiments and some clinical trial studies....
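The observation that fewer drugs are available as more drug features are combined follows from a simple set argument: a drug is usable for a feature combination only if it has data for every feature in it. The sketch below illustrates this with hypothetical feature names and drug IDs; it says nothing about the actual features or database used in the study.

```python
from itertools import combinations


def drugs_with_all(features, drugs_by_feature):
    """Drugs that have data for every feature in the combination."""
    return set.intersection(*(drugs_by_feature[f] for f in features))


# Hypothetical coverage: which toy drugs have data for which feature.
toy_coverage = {
    "feature_1": {"d1", "d2", "d3", "d4"},
    "feature_2": {"d2", "d3", "d4"},
    "feature_3": {"d3", "d4", "d5"},
}

# Number of usable drugs for every combination of one, two, or three features.
available = {
    combo: len(drugs_with_all(combo, toy_coverage))
    for size in (1, 2, 3)
    for combo in combinations(sorted(toy_coverage), size)
}
```

Because each added feature can only remove drugs from the intersection, the count for a combination is never larger than the count for any of its sub-combinations.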
